(54) Fault resilient/fault tolerant computing

(57) A method of synchronizing at least two comput- 
ing elements (CE1, CE2) that each have clocks that 
operate asynchronously of the clocks of the other com- 
puting elements includes selecting one or more signals, 
designated as meta time signals, from a set of signals 
produced by the computing elements (CE1, CE2),
monitoring the computing elements (CE1, CE2) to detect the
production of a selected signal by one of the computing
elements (CE1), waiting for the other computing
elements (CE2) to produce a selected signal, transmitting
equally valued time updates to each of the computing 
elements, and updating the clocks of the computing
elements (CE1, CE2) based on the time updates. In a
second aspect of the invention, fault resilient, or tolerant,
computers (200) are produced by designating a first 
processor as a computing element (204), designating a 
second processor (202) as a controller, connecting the 
computing element (204) and the controller (202) to pro- 
duce a modular pair, and connecting at least two modu- 
lar pairs to produce a fault resilient or fault tolerant 
computer (200). Each computing element (202, 204) of 
the computer (200) performs all instructions in the same
number of cycles as the other computing elements
(202, 204). The computer systems include one or more 
controllers (202) and at least two computing elements 
(204). 
Description

Background of the invention 

[0001] The invention relates to fault resilient and fault
tolerant computing methods and apparatus.
[0002] Fault resilient computer systems can continue 
to function in the presence of hardware failures. These 
systems operate in either an availability mode or an 
integrity mode, but not both. A system is "available"
when a hardware failure does not cause unacceptable
delays in user access, and a system operating in an
availability mode is configured to remain online, if
possible, when faced with a hardware error. A system has
data integrity when a hardware failure causes no data
loss or corruption, and a system operating in an integrity
mode is configured to avoid data loss or corruption,
even if it must go offline to do so.
[0003] Fault tolerant systems stress both availability 
and integrity. A fault tolerant system remains available
and retains data integrity when faced with a single
hardware failure, and, under some circumstances, with
multiple hardware failures.

[0004] Disaster tolerant systems go one step beyond 
fault tolerant systems and require that loss of a
computing site due to a natural or man-made disaster will not
interrupt system availability or corrupt or lose data. 
[0005] Prior approaches to fault tolerance include soft- 
ware checkpoint/restart, triple modular redundancy, and 
pair and spare.
[0006] Checkpoint/restart systems employ two or
more computing elements that operate asynchronously
and may execute different applications. Each
application periodically stores an image of the state of the
computing element on which it is running (a checkpoint).
When a fault in a computing element is detected, the
checkpoint is used to restart the application on another
computing element (or on the same computing element
once the fault is corrected). To implement a
checkpoint/restart system, each of the applications and/or the
operating system to be run on the system must be
modified to periodically store the image of the system. In
addition, the system must be capable of "backtracking"
(that is, undoing the effects of any operations that
occurred subsequent to a checkpoint that is being
restarted).

[0007] With triple modular redundancy, three comput- 
ing elements run the same application and are operated 
in cycle-by-cycle lockstep. All of the computing
elements are connected to a block of voting logic that
compares the outputs (that is, the memory interfaces) of the
three computing elements and, if all of the outputs are
the same, continues with normal operation. If one of the
outputs is different, the voting logic shuts down the
computing element that has produced the differing output.
The voting logic, which is located between the
computing elements and memory, has a significant impact on
system speed. 



[0008] Pair and spare systems include two or more 
pairs of computing elements that run the same
application and are operated in cycle-by-cycle lockstep. A
controller monitors the outputs (that is, the memory
interfaces) of each computing element in a pair. If the 
outputs differ, both computing elements in the pair are 
shut down. 

Summary of the Invention 

[0009] According to the invention, a fault resilient 
and/or fault tolerant system is obtained through use of
at least two computing elements ("CEs") that operate
asynchronously in real time (that is, from cycle to cycle) 
and synchronously in so-called "meta time." The CEs 
are synchronized at meta times that occur often enough 
so that the applications running on the CEs do not 
diverge, but are allowed to run asynchronously between 
the meta times. For example, the CEs could be synchro- 
nized once each second and otherwise run asynchro- 
nously. Because the CEs are resynchronized at each 
meta time, the CEs are said to be operating in meta time 
lockstep. 

[0010] In particular embodiments, meta times are 
defined as the times at which the CEs request I/O oper- 
ations. In these embodiments, the CEs are synchro- 
nized after each I/O operation and run asynchronously 
between I/O operations. This approach is applicable to 
systems in which at least two asynchronous computing 
elements running identical applications always gener- 
ate I/O requests in the same order. This approach can 
be further limited to resynchronization after only those 
I/O requests that modify the processing environment
(that is, write requests).

[0011] Meta time synchronization according to the 
invention is achieved through use of a paired modular 
redundant architecture that is transparent to applica- 
tions and operating system software. According to this 
architecture, each CE is paired with a controller,
otherwise known as an I/O processor ("IOP"). The IOPs
perform any I/O operations requested by or directed to the
CEs, detect hardware faults, and synchronize the CEs
with each other after each I/O operation. In systems in
which I/O requests are not issued with sufficient
frequency, the IOPs periodically synchronize the CEs in
response to so-called "quantum interrupts" generated
by inter-processor interconnect (IPI) modules
connected to the CEs.

[0012] In another particular embodiment of the inven- 
tion, rather than synchronizing the CEs based on each 
particular I/O operation, the CEs are synchronized 
based on a window of I/O operations. In this approach,
a list of I/O operations is maintained for each CE and 
the CEs are synchronized whenever a common entry 
appears in all of the lists. This approach allows flexibility 
as to the order in which I/O requests are generated. 
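
The window-based approach of paragraph [0012] can be illustrated with a short sketch. It is only a minimal model under stated assumptions: I/O operations are represented as plain strings, and the function name find_sync_point is hypothetical and does not appear in the specification.

```python
from typing import Optional, Sequence


def find_sync_point(windows: Sequence[Sequence[str]]) -> Optional[str]:
    """Return the earliest entry of the first list that appears in every list.

    Each inner sequence is the (hypothetical) list of I/O operations kept
    for one CE. None means no common entry exists yet, so the CEs would
    keep running asynchronously.
    """
    if not windows:
        return None
    others = [set(w) for w in windows[1:]]
    for op in windows[0]:
        if all(op in seen for seen in others):
            return op
    return None


# The two CEs issued the same requests in a different order.
ce1_window = ["read A", "write B", "write C"]
ce2_window = ["write B", "read A"]
sync_op = find_sync_point([ce1_window, ce2_window])
```

Because the lists are matched by content rather than by position, the two CEs in this example can still be synchronized even though their requests were generated in different orders.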
[0013] In yet another exemplary embodiment of the 
invention, the CEs are synchronized based either on 
signals that are periodically generated by the operating
system or on hardware generated interrupts. For exam- 
ple, in the hardware interrupt approach, a processor of 
each CE is modified to generate an interrupt every N
cycles and the CEs are synchronized in response to
those interrupts.

[0014] Primary components of a paired modular 
redundant system include software, off-the-shelf IOPs,
off-the-shelf CEs, and pairs of customized IPI modules
that plug into expansion slots of the IOP and the CE and
are interconnected by a cable. Redundant I/O devices
can be connected to one or more of the CEs or IOPs to
provide redundant I/O and offer features such as
volume shadowing of key mass storage devices. A paired
modular redundant system can accommodate any I/O
device that is compatible with a processor used in
implementing an IOP of the system.
[0015] The paired modular redundant architecture 
uses minimal custom software and hardware to enable 
at least two off-the-shelf computing elements to be
combined into a fault resilient or tolerant system that runs
industry standard operating systems, such as Windows
NT, DOS, OS/2, or UNIX, and unmodified applications.
Thus, the architecture can avoid both the high costs and
inflexibility of the proprietary operating systems,
applications, and processor designs used in the prior art.
[0016] Another advantage of the paired modular 
redundant architecture of the present invention is that it 
offers a certain degree of software fault tolerance. The 
majority of software errors are not algorithmic. Instead,
most errors are caused by asynchrony between the 
computing element and I/O devices that results in I/O 
race conditions. By decoupling I/O requests from the 
computing elements, the paired modular redundant 
architecture should substantially reduce the number of
so-called "Heisenbug" software errors that result from 
such asynchrony. 

[0017] In one aspect, generally, the invention features
forming a fault tolerant or fault resilient computer by
using at least one controller to synchronize at least two
computing elements that each have clocks operating 
asynchronously of the clocks of the other computing 
elements. One or more signals, designated as meta
time signals, are selected from a set of signals
produced by the computing elements. Thereafter, the
computing elements are monitored to detect the production
of selected signals by one of the computing elements. 
Once a selected signal is detected, the system waits for 
the production of selected signals by the other
computing elements, and, upon receiving the selected signals,
transmits equal time updates to each of the computing 
elements. The clocks of the computing elements are 
then updated based on the time updates. 
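
The sequence described in paragraph [0017] (detect a selected signal, wait for the other computing elements, then send every element the same time update) might be modeled as follows. This is a sketch under assumptions, not the claimed implementation: the class names ComputingElement and Controller, the on_signal method, and the use of a floating-point clock are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ComputingElement:
    name: str
    clock: float = 0.0  # local time; changed only by time updates


@dataclass
class Controller:
    elements: List[ComputingElement]
    pending: Dict[str, str] = field(default_factory=dict)

    def on_signal(self, ce_name: str, signal: str, now: float) -> Optional[bool]:
        """Record a meta time signal and synchronize once every CE has sent one.

        Returns None while still waiting for other CEs, True when the
        signals matched and the clocks were updated, False on a mismatch.
        """
        self.pending[ce_name] = signal
        if len(self.pending) < len(self.elements):
            return None  # wait for the remaining computing elements
        matched = len(set(self.pending.values())) == 1
        self.pending.clear()
        if matched:
            for ce in self.elements:
                ce.clock = now  # the same time update goes to every CE
        return matched


ce1, ce2 = ComputingElement("CE1"), ComputingElement("CE2")
controller = Controller([ce1, ce2])
first = controller.on_signal("CE1", "write B", now=100.0)
second = controller.on_signal("CE2", "write B", now=100.5)
```

In this model the first call returns None because CE2 has not yet produced its signal; the second call completes the set, and both clocks receive the identical value 100.5 regardless of when each element actually produced its signal.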
[0018] Preferred embodiments of the invention include
the features listed below. First, I/O requests are the
selected signals. The I/O requests are processed to 
produce I/O responses that are transmitted with the 
time updates. In addition to, or instead of, I/O requests,
quantum interrupts can be the selected signals. The
computing elements count either executed instructions 
or the cycles of a clock such as the system clock, bus 
clock, or I/O clock, and generate quantum interrupts 
whenever a predefined number of instructions or cycles 
occurs. When both I/O requests and quantum interrupts 
are used as the selected signals, the computing
elements count the number of instructions or cycles that
occur without an I/O request. For example, a computing 
element could be programmed to generate a quantum 
interrupt whenever it processes for one hundred cycles 
without generating an I/O request. 
[0019] In one embodiment, instructions are counted 
by loading a counter with a predetermined value,
enabling the counter with an I/O request, decrementing the
value of the counter, and signalling a quantum interrupt 
when the value of the counter reaches zero. In another 
approach, debugging features of the processor are 
used to generate the quantum interrupts. 
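
The counter-based embodiment of paragraph [0019] can be sketched in software. The class name QuantumCounter and its methods are hypothetical, and reloading the counter after each interrupt is an assumption that the text does not state.

```python
class QuantumCounter:
    """Software model of the instruction counter (hypothetical names)."""

    def __init__(self, quantum: int) -> None:
        if quantum <= 0:
            raise ValueError("quantum must be positive")
        self.quantum = quantum
        self.remaining = quantum  # counter loaded with the predetermined value
        self.enabled = False

    def on_io_request(self) -> None:
        """An I/O request reloads and enables the counter."""
        self.remaining = self.quantum
        self.enabled = True

    def tick(self) -> bool:
        """Count one instruction; return True when a quantum interrupt fires."""
        if not self.enabled:
            return False
        self.remaining -= 1
        if self.remaining == 0:
            self.remaining = self.quantum  # assumed: reload for the next quantum
            return True
        return False


counter = QuantumCounter(100)
counter.on_io_request()
fired = [counter.tick() for _ in range(100)]
```

With a quantum of one hundred, the interrupt fires on the hundredth instruction executed after the I/O request, which matches the example given in paragraph [0018].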
[0020] For fault detection, the selected signals and 
accompanying data, if any, from each of the computing 
elements are compared. If they do not match, a signal is 
generated to indicate that a fault has occurred. 
[0021] In some embodiments, the computing
elements wait for time updates by pausing operation after
producing the selected signals. The computing ele- 
ments resume operation upon receipt of the time 
updates. In other embodiments, the computing ele- 
ments continue operation after producing the selected 
signals. 

[0022] To avoid problems that can be caused by
asynchronous activities of the computing elements, the
asynchronous activities are disabled. The functions of
the asynchronous activities are then performed when a
selected signal is produced. For example, normal mem- 
ory refresh functions are disabled and, in their place, 
burst memory refreshes are performed each time that a 
selected signal, such as an I/O request or a quantum 
interrupt, is produced. 

[0023] The invention also features a method of
producing fault resilient or fault tolerant computers by
designating a first processor as a computing element,
designating a second processor as a controller, and
connecting the computing element and the controller to 
produce a modular pair. Thereafter, at least two modular 
pairs are connected to produce a fault resilient or fault 
tolerant computer. The processors used for the
computing elements need not be identical to each other, but
preferably they all perform each instruction of their 
instruction sets in the same number of cycles as are 
taken by the other processors. Typically, industry
standard processors are used in implementing the computing
elements and the controllers. For disaster tolerance, at 
least one of the modular pairs can be located remotely 
from the other modular pairs. The controllers and com- 
puting elements are each able to run unmodified indus- 
try standard operating systems and applications. In 
addition, the controllers are able to run a first operating 
system while the computing elements simultaneously
run a second operating system. 
[0024] I/O fault resilience is obtained by connecting 
redundant I/O devices to at least two modular pairs and
transmitting at least identical I/O write requests and
data to the redundant I/O devices. While I/O read 
requests need only be transmitted to one of the I/O 
devices, identical I/O read requests may be transmitted 
to more than one of the I/O devices to verify data integ- 
rity. When redundant I/O devices are connected to three 
or more modular pairs, transmission of identical I/O 
requests allows identification of a faulty I/O device.
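
One way the three-device case of paragraph [0024] could identify a faulty I/O device is by majority vote over the responses to identical read requests. The text does not say which comparison rule is used, so the voting rule, the function name vote, and the device names below are assumptions made for illustration.

```python
from collections import Counter
from typing import Dict, List, Optional, Tuple


def vote(responses: Dict[str, bytes]) -> Tuple[Optional[bytes], List[str]]:
    """Majority vote over read responses keyed by (hypothetical) device name.

    Returns the agreed value and the devices that disagree with it, or
    (None, []) when no strict majority exists and no device can be blamed.
    """
    if not responses:
        return None, []
    value, count = Counter(responses.values()).most_common(1)[0]
    if count <= len(responses) // 2:
        return None, []
    suspects = sorted(dev for dev, data in responses.items() if data != value)
    return value, suspects


agreed, faulty = vote({"disk0": b"abc", "disk1": b"abc", "disk2": b"abX"})
```

With only two devices a disagreement can be detected but not attributed, which is consistent with the text requiring three or more modular pairs for identification.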
[0025] In another aspect, generally, the invention
features isolating I/O requests from computing operations
in a computer through use of I/O redirection. Typically, 
I/O devices are accessed either through low level I/O 
requests or by directly addressing the I/O devices. Low 
level I/O requests include requests to the system's basic 
input output system (i.e., BIOS), boot firmware
requests, boot software requests, and requests to the 
system's physical device driver software. When a com- 
puting element issues a low level I/O request the inven- 
tion features using software to redirect the I/O requests 
to an I/O processor. When the computing element 
directly addresses the physical I/O devices, the inven- 
tion features providing virtual I/O devices that simulate 
the interfaces of physical I/O devices. Directly
addressed I/O requests are intercepted and provided to
the virtual I/O devices. Periodically, the contents of the
virtual I/O devices are transmitted to the I/O
processor(s) as I/O requests. At the I/O processor(s), the
transmitted contents of the virtual I/O devices are
provided to the physical I/O devices. After the requested
I/O operations are performed, the results of the
operations, if any, are returned to the computing elements as
responses to the I/O requests. Typically, the virtual I/O 
devices include a virtual keyboard and a virtual display. 
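
The virtual device mechanism of paragraph [0025] can be pictured with a small model of a virtual display. Everything here is a hypothetical illustration: the class VirtualDisplay, the representation of display memory as offset/value pairs, and the choice to forward only the cells written since the last transmission are assumptions, not details taken from the text.

```python
from typing import Dict, List, Tuple


class VirtualDisplay:
    """Stands in for directly addressed display memory (hypothetical model)."""

    def __init__(self) -> None:
        self.cells: Dict[int, int] = {}  # current contents of the virtual device
        self.dirty: Dict[int, int] = {}  # writes made since the last flush

    def write(self, offset: int, value: int) -> None:
        """Intercepted direct write: update the virtual device only."""
        self.cells[offset] = value
        self.dirty[offset] = value

    def flush(self) -> List[Tuple[int, int]]:
        """Return pending writes as I/O requests for the I/O processor."""
        requests = sorted(self.dirty.items())
        self.dirty.clear()
        return requests


display = VirtualDisplay()
display.write(0, 72)
display.write(1, 105)
display.write(0, 74)  # overwrites the earlier value at offset 0
requests = display.flush()
```

The computing element never touches a physical device in this model; only the periodic flush produces requests that an I/O processor would apply to the real display.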
[0026] The invention also features detecting and diag- 
nosing faults in a computer system that includes at least 
two controllers that are connected to each other and to 
at least two computing elements, and at least two com- 
puting elements that are each connected to at least two 
of the controllers. Each computing element produces 
data and generates a value, such as an error checking 
code, that relates to the data. Each computing element 
then transmits the data, along with its corresponding 
value, to the at least two controllers to which it is con- 
nected. When the controllers receive the data and asso- 
ciated values, they transmit the values to the other 
controllers. Each controller then performs computations 
on the values corresponding to each computing ele- 
ment and the values corresponding to each controller. If 
the results of the computations on the values corre- 
sponding to each controller are equal, and the results of 
the computations on the values corresponding to each 
computing element are equal, then no fault exists.
Otherwise, a fault exists. In some instances, the
computation may be a simple bit by bit comparison.



[0027] When a fault exists, fault diagnosis is attempted
by comparing, for each one of the computing elements,
all of the values corresponding to the one computing
element. If the values corresponding to each computing
element match for each computing element, but
mismatch for different computing elements, then one of the
computing elements is faulty. If the values
corresponding to only one of the computing elements mismatch,
then a path to that computing element is faulty. If the
values corresponding to multiple computing elements
mismatch, then the controller that is connected to the
mismatching computing elements is faulty. Once
identified, the faulty element is disabled.
[0028] A system according to the invention can
restore itself to full capability after a faulty element (that
is, a CE, an IOP, a storage device, etc.) is repaired. The
system does so by transferring the state of an active
element to the repaired element and, thereafter,
reenabling the repaired element. Inactive or repaired
processors are activated by transferring the operational state
of an active processor or processors to the inactive
processor through a controller. When the inactive
processor is a computing element, the operational state of
an active computing element (or elements) is
transferred through a controller. When the inactive processor
is a controller, the operating state of an active controller
is directly transferred. The transfer can occur either
when system operation is paused or as a background
process.

[0029] This recovery capability can also be used to
provide on-line upgrades of hardware, software, or both
by causing a processor of the system to fail by, for
example, turning it off. The upgrade is then performed
by either replacing or modifying the disabled processor.
The upgraded processor is then turned on and
reactivated as discussed above.

[0030] The invention also features a single controller,
dual computing element system in which a controller is
connected to two computing elements. In this computer
system, I/O operations by the computing elements are
intercepted and redirected to the controller. Typically,
the controller and the two computing elements each
include an industry standard motherboard, and are
each able to run unmodified industry standard operating
systems and applications. In addition, the controller is
able to run a first operating system while the computing
elements simultaneously run a second operating
system.

[0031] The single controller system can be expanded
to include a second controller connected both to the first
controller and to the two computing elements. For
purposes of providing limited disaster resilience, the first
controller and one of the computing elements can be
placed in a location remote from the second controller
and the other computing element, and can be
connected to the second controller and the other computing
element by a communications link.
[0032] For improved availability and performance, the
dual controller, dual computing element system can be
connected to an identical second system. The two
systems then run a distributed computing environment in
which one of the systems runs a first portion of a first
application and the other system runs either a second
application or a second portion of the first application.
[0033] In another embodiment, the invention features 
a computer system that includes three controllers con- 
nected to each other and three computing elements that 
are each connected to different pairs of the three
controllers. This system, like the other systems, also
features intercepting I/O operations by the computing
elements and redirecting them to the controllers for
processing. For disaster resilience, the first controller
and one of the computing elements are placed in a
location remote from the remaining controllers and
computing elements, or each controller/computing element pair
is placed in a different location. 

[0034] A disaster tolerant system is created by
connecting at least two of the three controller systems
described above. The three controller systems are 
placed in remote locations and connected by a commu- 
nications link. 

Brief Description of the Drawings

[0035]

Fig. 1 is a block diagram of a partially fault resilient
system.
Fig. 2 is a block diagram of system software of the
system of Fig. 1.
Fig. 3 is a flowchart of a procedure used by an IOP
Monitor of the system software of Fig. 2.
Fig. 4 is a block diagram of an IPI module of the
system of Fig. 1.
Fig. 5 is a state transition table for the system of Fig. 1.
Fig. 6 is a block diagram of a fault resilient system.
Fig. 7 is a block diagram of a distributed fault
resilient system.
Fig. 8 is a block diagram of a fault tolerant system.
Fig. 9 is a flowchart of a fault diagnosis procedure
used by IOPs of the system of Fig. 8.
Fig. 10 is a block diagram of a disaster tolerant
system.

Description of the Preferred Embodiments

[0036] As illustrated in Fig. 1, a fault resilient system
10 includes an I/O processor ("IOP") 12 and two
computing elements ("CEs") 14a, 14b (collectively referred
to as CEs 14). Because system 10 includes only a
single IOP 12 and therefore cannot recover from a failure in
IOP 12, system 10 is not entirely fault resilient.
[0037] IOP 12 includes two inter-processor
interconnect ("IPI") modules 16a, 16b that are connected,
respectively, to corresponding IPI modules 18a, 18b of
CEs 14 by cables 20a, 20b. IOP 12 also includes a
processor 22, a memory system 24, two hard disk drives 26,
28, and a power supply 30. Similarly, each CE 14
includes a processor 32, a memory system 34, and a
power supply 36. Separate power supplies 36 are used 
to ensure fault resilience in the event of a power supply 
failure. Processors 32a, 32b are "identical" to each 
other in that, for every instruction, the number of cycles 
required for processor 32a to perform an instruction is 
identical to the number of cycles required for processor 
32b to perform the same instruction. In the illustrated
embodiment, system 10 has been implemented using 
standard Intel 486 based motherboards for processors 
22, 32 and four megabytes of memory for each of mem- 
ory systems 24, 34. 

[0038] IOP 12 and CEs 14 of system 10 run
unmodified operating system and applications software, with
hard drive 26 being used as the boot disk for the IOP
and hard drive 28 being used as the boot disk for CEs
14. In truly fault resilient or fault tolerant systems that
include at least two IOPs, each hard drive would also be
duplicated.

[0039] In the illustrated embodiment, the operating
system for IOP 12 and CEs 14 is DOS. However, other
operating systems can also be used. Moreover, IOP 12
can run a different operating system from the one run by
CEs 14. For example, IOP 12 could run Unix while CEs
14 run DOS. This approach is advantageous because it
allows CEs 14 to access peripherals from operating
systems that do not support the peripherals. For
example, if CEs 14 were running an operating system that did
not support CD-ROM drives, and IOP 12 were running
one that did, CEs 14 could access the CD-ROM drive by
issuing I/O requests identical to those used to, say,
access a hard drive. IOP 12 would then handle the
translation of the I/O request to one suitable for
accessing the CD-ROM drive.

[0040] Referring also to Fig. 2, system 10 includes
specialized system software 40 that controls the booting
and synchronization of CEs 14, disables local time in
CEs 14, redirects all I/O requests from CEs 14 to IOP 12
for execution, and returns the results of the I/O
requests, if any, from IOP 12 to CEs 14.
[0041] System software 40 includes two sets of IPI 
BIOS 42 that are ROM-based and are each located in 
the IPI module 18 of a CE 14. IPI BIOS 42 are used in 
bootup and synchronization activities. When a CE 14 is 
booted, IPI BIOS 42 replaces the I/O interrupt 
addresses in the system BIOS interrupt table with
addresses that are controlled by CE Drivers 44. The 
interrupt addresses that are replaced include those
corresponding to video services, fixed disk services, serial
communications services, keyboard services, and time 
of day services. 
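
The interrupt table replacement of paragraph [0041] can be modeled as a table of handlers in which only the I/O entries are repointed. This sketch makes several assumptions: the vector numbers are the conventional PC BIOS assignments for these services and are not stated in the text, and the names IO_VECTORS, redirect_io_vectors, system_bios, and ce_driver are hypothetical.

```python
from typing import Callable, Dict

Handler = Callable[[], str]

# Conventional PC BIOS software interrupt numbers (assumed, not from the text).
IO_VECTORS: Dict[int, str] = {
    0x10: "video services",
    0x13: "fixed disk services",
    0x14: "serial communications services",
    0x16: "keyboard services",
    0x1A: "time of day services",
}


def redirect_io_vectors(table: Dict[int, Handler], driver: Handler) -> Dict[int, Handler]:
    """Return a copy of the table with every I/O vector pointing at driver."""
    patched = dict(table)
    for vector in IO_VECTORS:
        patched[vector] = driver
    return patched


def system_bios() -> str:
    return "handled locally by system BIOS"


def ce_driver() -> str:
    return "redirected to IOP"


original = {vector: system_bios for vector in (0x10, 0x11, 0x13, 0x14, 0x16, 0x1A)}
patched = redirect_io_vectors(original, ce_driver)
```

Vector 0x11 is left alone in the example to show that calls outside the listed I/O services would still reach the system BIOS directly.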

[0042] IPI BIOS 42 also disables normal memory
refreshing to ensure that memory refreshing, which 
affects the number of cycles during which a CE 14 is 
actually processing, is controlled by system software 
40. Memory refreshing is required to maintain memory
integrity. In known refreshing methods, memory is 
refreshed periodically, with one block of memory being 
refreshed at the end of each refresh period. The
duration of the refresh period is selected so that the entire
memory is refreshed within the memory's refresh limit.
Thus, for example, if a memory has 256 blocks and an 8
ms refresh limit, then the refresh period is 31.25 µs (8
ms/256).
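
The arithmetic in the example of paragraph [0042] can be checked with a one-line calculation. The function name refresh_period_us is hypothetical.

```python
def refresh_period_us(refresh_limit_ms: float, blocks: int) -> float:
    """Refresh period in microseconds when one block is refreshed per period."""
    if blocks <= 0:
        raise ValueError("blocks must be positive")
    return refresh_limit_ms * 1000.0 / blocks


period = refresh_period_us(8.0, 256)  # 8 ms limit, 256 blocks
```

For the values in the text this gives 8000 µs / 256 = 31.25 µs per block.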

[0043] In the described embodiment, IPI BIOS 42
disables memory refreshing by placing a counter used in
the Intel 486 motherboard to control memory refreshing
in a mode that requires a gate input to the counter to
change in order to increment. Because the gate input is
typically connected to the power supply, the gate input
never changes and the counter is effectively disabled.
[0044] Two CE Drivers 44 of system software 40
handle memory refreshing by burst refreshing multiple
blocks of memory each time that an I/O request or
quantum interrupt is generated. CE Drivers 44 are
stored on CE boot disk 28 and are run by CEs 14. In
addition to performing burst memory refreshes, CE
Drivers 44 intercept I/O requests to the system BIOS
and redirect them through IPI modules 18 to IOP 12 for
execution. CE Drivers 44 also respond to interrupt
requests from IPI modules 18, disable the system clock,
and, based on information supplied by IOP Monitor 48,
control the time of day of CEs 14.
[0045] An IOP Driver 46 that is located on IOP boot
disk 26 and is run by IOP 12 handles I/O requests from
CEs 14 by redirecting them to an IOP Monitor 48 for
processing and transmitting the results from IOP
Monitor 48 to CEs 14. IOP Driver 46 communicates with CE
Drivers 44 using a packet protocol.

[0046] IOP Monitor 48 is located on IOP boot disk 26
and is run by IOP 12. IOP Monitor 48 controls system 10
and performs the actual I/O requests to produce the
results that are transmitted by IOP Driver 46 to CEs 14.
[0047] System software 40 also includes console
software 49 that runs on IOP 12 and provides for user
control of system 10. Using console software 49, a user can
reset, boot, or synchronize a CE 14. The user can also
set one or both of CEs 14 to automatically boot
(autoboot) and/or automatically synchronize (autosync) after
being reset or upon startup. The ability to control each
CE 14 is useful both during normal operation and for
test purposes. Using console software 49, the user can
also place system 10 into either an integrity mode in
which IOP Monitor 48 shuts down both CEs 14 when
faced with a miscompare error, a first availability mode
in which IOP Monitor 48 disables CE 14a when faced
with a miscompare error, or a second availability mode
in which IOP Monitor 48 disables CE 14b when faced
with a miscompare error. Finally, console software 49
allows the user to request the status of system 10. In an
alternative embodiment, console software 49 could be
implemented using a separate processor that
communicates with IOP 12.
[0048] Each CE 14 runs a copy of the same
application and the same operating system as that run by the
other CE 14. Moreover, the contents of memory
systems 34a and 34b are the same, and the operating
context of CEs 14 is the same at each synchronization
time. Thus, IOP Monitor 48 should receive identical
sequences of I/O requests from CEs 14. 
[0049] As shown in Fig. 3, IOP Monitor 48 processes
and monitors I/O requests according to a procedure
100. Initially, IOP Monitor 48 waits for an I/O request
from one of CEs 14 (step 102). Upon receiving an I/O
request packet from, for example, CE 14b, IOP Monitor
48 waits for either an I/O request from CE 14a or for the
expiration of a timeout period (step 104). Because
system 10 uses the DOS operating system, which halts
execution of an application while an I/O request is being
processed, IOP Monitor 48 is guaranteed not to receive
an I/O request from CE 14b while waiting (step 104) for
the I/O request from CE 14a.
[0050] Next, IOP Monitor 48 checks to determine
whether the timeout period has expired (step 106). If not
(that is, an I/O request packet from CE 14a has arrived),
IOP Monitor 48 compares the checksums of the packets
(step 108), and, if the checksums are equal, processes
the I/O request (step 110). After processing the I/O
request, IOP Monitor 48 issues a request to the system
BIOS of IOP 12 for the current time of day (step 112).
[0051] After receiving the time of day, IOP Monitor 48
assembles an IPI packet that includes the time of day
and the results, if any, of the I/O request (step 114) and
sends the IPI packet to IOP Driver 46 (step 116) for
transmission to CEs 14. When CEs 14 receive the IPI
packet, they use the transmitted time of day to update
their local clocks which, as already noted, are otherwise
disabled.

[0052] As required by DOS, execution in CEs 14 is
suspended until IOP Monitor 48 returns the results of
the I/O request through IOP Driver 46. Because, before
execution is resumed, the times of day of both CEs 14
are updated to a common value (the transmitted time of
day from the IPI packet), the CEs 14 are kept in time
synchronization with the transmitted time of day being
designated the meta time. If a multitasking operating
system were employed, execution in CEs 14 would not
be suspended while IOP Monitor 48 performed the I/O
request. Instead, processing in CEs 14 would be
suspended only until receipt of an acknowledgement
indicating that IOP Monitor 48 has begun processing the
I/O request (step 110). The acknowledgement would
include the time of day and would be used by CEs 14 to
update the local clocks.

[0053] After sending the IPI packet to IOP Driver 46, 
IOP Monitor 48 verifies that both of CEs 14 are online 
(step 118), and, if so, waits for another I/O request from
one of CEs 14 (step 102). 

[0054] If the timeout period has expired (step 106), 
IOP Monitor 48 disables the CE 14 that failed to respond
(step 119) and processes the I/O request (step 110).
[0055] If there is a miscompare between the
checksums of the packets from CEs 14 (step 108), IOP
Monitor 48 checks to see if system 10 is operating in an
availability mode or an integrity mode (step 120).
[0056] If system 10 is operating in an availability
mode, IOP Monitor 48 disables the appropriate CE 14
based on the selected availability mode (step 122), and
processes the I/O request (step 110). Thereafter, when
IOP Monitor 48 checks whether both CEs 14 are online
(step 118), and assuming that the disabled CE 14 has
not been repaired and reactivated, IOP Monitor 48 then
waits for an I/O request from the online CE 14 (step
124). Because system 10 is no longer fault resilient,
when an I/O request is received, IOP Monitor 48
immediately processes the I/O request (step 110).
[0057] If system 10 is operating in an integrity mode
when a miscompare is detected, IOP Monitor 48
disables both CEs 14 (step 126) and stops processing
(step 128).
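
The branches of procedure 100 can be summarized in a short sketch. This is illustrative Python rather than code from the system itself: the names `Mode`, `handle_request`, and `preferred_ce`, and the shape of the returned dictionary, are hypothetical. The text says only that the CE to disable depends on the selected availability mode, so that policy is modelled here as a simple parameter.

```python
from enum import Enum
from typing import Optional


class Mode(Enum):
    AVAILABILITY = "availability"
    INTEGRITY = "integrity"


def handle_request(checksum_a: int,
                   checksum_b: Optional[int],
                   mode: Mode,
                   preferred_ce: str = "CE 14a") -> dict:
    """Decide what the monitor does once a packet from CE 14a has arrived.

    checksum_b is None when CE 14b did not respond before the timeout.
    preferred_ce is a hypothetical stand-in for the availability-mode
    policy that selects which CE survives a miscompare.
    """
    if checksum_b is None:
        # Timeout (step 106): disable the silent CE, still process (steps 119, 110).
        return {"disable": ["CE 14b"], "process": True}
    if checksum_a == checksum_b:
        # Checksums agree (step 108): process the request (step 110).
        return {"disable": [], "process": True}
    if mode is Mode.AVAILABILITY:
        # Miscompare in availability mode (step 122): keep one CE running.
        loser = "CE 14b" if preferred_ce == "CE 14a" else "CE 14a"
        return {"disable": [loser], "process": True}
    # Miscompare in integrity mode (steps 126, 128): stop everything.
    return {"disable": ["CE 14a", "CE 14b"], "process": False}
```

The sketch makes the trade-off visible: availability mode always keeps processing with whatever CE remains, while integrity mode prefers to stop rather than act on data that could not be cross-checked.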

[0058] Referring again to Figs. 1 and 2, when the
application or the operating system of, for example, CE
14a makes a non-I/O call to the system BIOS, the system
BIOS executes the request and returns the results
to the application without invoking system software 40.
However, if the application or the operating system
makes an I/O BIOS call, CE Driver 44a intercepts the
I/O request. After intercepting the I/O request, CE Driver
44a packages the I/O request into an IPI packet and
transmits the IPI packet to IOP 12.
[0059] When IPI module 16a of IOP 12 detects
transmission of an IPI packet from CE 14a, IPI module 16a
generates an interrupt to IOP Driver 46. IOP Driver 46
then reads the IPI packet.

[0060] As discussed above, IOP Monitor 48 responds
to the IPI packet from CE 14a according to procedure
100. As also discussed, assuming that there are no
hardware faults, IOP Driver 46 eventually transmits an
IPI packet that contains the results of the I/O request
and the time of day to CEs 14.
[0061] IPI modules 18 of CEs 14 receive the IPI
packet from IOP 12. CE Drivers 44 unpack the IPI
packet, update the time of day of CEs 14, and return
control of CEs 14 to the application or the operating
system running on CEs 14.

[0062] If no I/O requests are issued within a given time
interval, the IPI module 18 of a CE 14 generates a
so-called quantum interrupt that invokes the CE Driver 44
of the CE 14. In response, the CE Driver 44 creates a
quantum interrupt IPI packet and transmits it to IOP 12.
IOP Monitor 48 treats the quantum interrupt IPI packet
as an IPI packet without an I/O request. Thus, IOP
Monitor 48 detects the incoming quantum interrupt IPI
packet (step 102 of Fig. 3) and, if a matching quantum
interrupt IPI packet is received from the other CE 14
(steps 104, 106, and 108 of Fig. 3), issues a request to
the system BIOS of IOP 12 for the current time of day
(step 112 of Fig. 3). IOP Monitor 48 then packages the
current time of day into a quantum response IPI packet
(step 114 of Fig. 3) that IOP Driver 46 then sends to
CEs 14 (step 116 of Fig. 3). CE Drivers 44 respond to
the quantum response IPI packet by updating the time
of day and returning control of CEs 14 to the application
or the operating system running on CEs 14.

[0063] If IOP Monitor 48 does not receive a quantum
interrupt IPI packet from the other CE 14 within a
predefined timeout period (step 106 of Fig. 3), IOP Monitor
48 responds by disabling the non-responding CE 14.
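
The way a common time of day reaches every CE can be illustrated with a small model. It is a sketch under stated assumptions, not the system's software: the classes `ComputingElement` and `IOProcessor`, the string packet kinds, and the injected `clock` callable are all hypothetical, and the timeout of step 106 is left out for brevity.

```python
from typing import Callable, Dict, List, Optional


class ComputingElement:
    """A CE whose clock only moves when the IOP sends a time update."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.time_of_day: Optional[float] = None

    def receive_response(self, time_of_day: float) -> None:
        self.time_of_day = time_of_day


class IOProcessor:
    def __init__(self, ces: List[ComputingElement],
                 clock: Callable[[], float]) -> None:
        self.ces = ces
        self.clock = clock
        self.pending: Dict[str, str] = {}

    def receive_packet(self, ce_name: str, kind: str) -> bool:
        """Record a packet; respond only once every CE has sent a matching one."""
        self.pending[ce_name] = kind
        if len(self.pending) < len(self.ces):
            return False  # still waiting for the other CE
        if len(set(self.pending.values())) != 1:
            raise RuntimeError("miscompare between CE packets")
        now = self.clock()  # read the IOP's time of day once
        for ce in self.ces:
            ce.receive_response(now)  # identical value goes to every CE
        self.pending.clear()
        return True
```

Because the time is read once and then broadcast, the CEs never observe different clock values, regardless of how far apart in real time their packets arrived.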

[0064] As shown in Fig. 1, IPI modules 16, 18 and
cables 20 provide all of the hardware necessary to
produce a fault resilient system from the standard Intel 486
based motherboards used to implement processors 22,
32. An IPI module 16 and an IPI module 18, which are
implemented using identical boards, each perform
similar functions.

[0065] As illustrated in Fig. 4, an IPI module 18
includes a control logic 50 that communicates I/O
requests and responses between the system bus of a
processor 32 of a CE 14 and a parallel interface 52 of
IPI module 18. Parallel interface 52, in turn,
communicates with the parallel interface of an IPI module 16
through a cable 20. Parallel interface 52 includes a
sixteen bit data output port 54, a sixteen bit data input port
56, and a control port 58. Cable 20 is configured so that
data output port 54 is connected to the data input port of
the IPI module 16, data input port 56 is connected to the
data output port of the IPI module 16, and control port
58 is connected to the control port of the IPI module 16.
Control port 58 implements a handshaking protocol
between IPI module 18 and the IPI module 16.
[0066] Control logic 50 is also connected to an IPI 
BIOS ROM 60. At startup, control logic 50 transfers IPI 
BIOS 42 (Fig. 2), the contents of IPI BIOS ROM 60, to
processor 32 through the system bus of processor 32.
[0067] A QI counter 62, also located on IPI module 18,
generates quantum interrupts as discussed above. QI
counter 62 includes a clock input 64 that is connected to
the system clock of processor 32 and a gate input 66
that is connected to control logic 50. Gate input 66 is
used to activate and reset the counter value of QI
counter 62. When activated, QI counter 62 decrements the
counter value by one during each cycle of the system
clock of processor 32. When the counter value reaches
zero, QI counter 62 generates a quantum interrupt that,
as discussed above, activates CE Driver 44 (Fig. 2).
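
The behaviour of the counter can be modelled in a few lines. This is a minimal software sketch of a hardware element: the class name `QICounter`, the `reload_value` parameter, and the choice to stop counting once the interrupt has fired are assumptions made for illustration.

```python
class QICounter:
    """Down-counter that raises a quantum interrupt when it reaches zero."""

    def __init__(self, reload_value: int) -> None:
        if reload_value <= 0:
            raise ValueError("reload_value must be positive")
        self.reload_value = reload_value
        self.value = reload_value
        self.active = False

    def activate(self) -> None:
        """Gate input asserted: reset the count and start counting."""
        self.value = self.reload_value
        self.active = True

    def deactivate(self) -> None:
        """Gate input released: stop counting."""
        self.active = False

    def clock_tick(self) -> bool:
        """Advance one system clock cycle; return True on a quantum interrupt."""
        if not self.active:
            return False
        self.value -= 1
        if self.value == 0:
            self.active = False  # assumption: idle until re-activated
            return True
        return False
```

Counting processor clock cycles rather than wall-clock time is what lets two CEs with unsynchronized clocks reach the interrupt at the same point in their instruction streams.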
[0068] CE Driver 44 deactivates QI counter 62 at the
beginning of each I/O transaction. CE Driver 44
deactivates QI counter 62 by requesting an I/O write at a first
address, known as the QI deactivation address. Control
logic 50 detects the I/O write request and deactivates
QI counter 62 through gate input 66. Because this
particular I/O write is for control purposes only, control logic
50 does not pass the I/O write to parallel interface 52. At
the conclusion of each I/O transaction, CE Driver 44
resets and activates QI counter 62 by requesting an I/O
write to a second address, known as the QI activation
address. Control logic 50 responds by resetting and
activating QI counter 62.

[0069] In an alternative approach, quantum interrupts
are generated through use of debugging or other
features available in processor 32. Some commonly
available processors include debugging or trap instructions
that trap errors by transferring control of the processor
to a designated program after the completion of a
selected number of instructions following the trap
instruction. In this approach, each time that CE Driver
44 returns control of processor 32 to the application or
operating system, CE Driver 44 issues a trap instruction
to indicate that control of processor 32 should be given
to CE Driver 44 upon completion of, for example, 300
instructions. After processor 32 completes the indicated
300 instructions, the trap instruction causes control of
processor 32 to be returned to CE Driver 44. In the
event that an I/O request activates CE Driver 44 prior to
completion of the indicated number of instructions, CE
Driver 44 issues an instruction that cancels the trap
instruction.

[0070] IPI module 18 is also used in activating an
offline CE 14. As discussed below, before an offline CE 
14 is activated, the contents of the memory system 34 
of the active CE 14 are copied into the memory system 
34 of the offline CE 14. To minimize the effects of this 
copying on the active CE 14, the processor 32 of the 
active CE 14 is permitted to continue processing and 
the memory is copied only during cycles in which the 
system bus of the processor 32 of the active CE 14 is 
not in use. 

[0071] To enable processor 32 to continue processing
while the memory is being copied, IPI module 18
accounts for memory writes by the processor 32 to
addresses that have already been copied to the offline
CE 14. To do so, control logic 50 monitors the system
bus and, when the processor 32 writes to a memory
address that has already been copied, stores the
address in a FIFO 68. When the memory transfer is
complete, or when FIFO 68 is full, the contents of
memory locations associated with the memory addresses
stored in FIFO 68 are copied to the offline CE 14 and
FIFO 68 is emptied. In other approaches, FIFO 68 is
modified to store both memory addresses and the
contents of memory locations associated with the
addresses, or to store the block addresses of memory
blocks to which memory addresses being written
belong.
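
The bookkeeping performed with FIFO 68 can be sketched as follows. The model is deliberately simplified and its names (`MemoryCopier`, `fifo_capacity`, and the method names) are hypothetical: memory is a dictionary from address to value, and an address counts as "already copied" once it appears in the offline copy.

```python
from collections import deque
from typing import Deque, Dict


class MemoryCopier:
    """Copies active memory to an offline CE while tracking later writes."""

    def __init__(self, active: Dict[int, int], fifo_capacity: int) -> None:
        self.active = active
        self.offline: Dict[int, int] = {}
        self.fifo: Deque[int] = deque()
        self.fifo_capacity = fifo_capacity

    def copy_address(self, address: int) -> None:
        """Background copy of one location (done on an idle bus cycle)."""
        self.offline[address] = self.active[address]

    def processor_write(self, address: int, value: int) -> None:
        """The active processor writes memory while the copy is in progress."""
        self.active[address] = value
        if address in self.offline:  # already copied, so the copy is now stale
            self.fifo.append(address)
            if len(self.fifo) >= self.fifo_capacity:
                self.flush()

    def flush(self) -> None:
        """Re-copy every location recorded in the FIFO, then empty it."""
        while self.fifo:
            address = self.fifo.popleft()
            self.offline[address] = self.active[address]
```

Writes to locations that have not yet been copied need no entry, since the background copy will pick up the new value when it reaches them.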

[0072] IPI module 18 also handles non-BIOS I/O
requests. In some computer systems, the BIOS is too
slow to effectively perform I/O operations such as video
display. As a result, some less structured or less
disciplined operating systems, such as DOS or UNIX, allow
applications to circumvent the BIOS and make
non-BIOS I/O requests by directly reading from or writing to
the addresses associated with I/O devices. These
non-BIOS I/O requests, which cannot be intercepted by
changing the system interrupt table, as is done in
connection with, for example, I/O disk reads and writes, are
problematic for a system in which synchronization
requires tight control of the I/O interface.
[0073] To remedy this problem, and to assure that
even non-BIOS I/O requests can be isolated and
managed by IOP 12, IPI module 18 includes virtual I/O
devices that mimic the hardware interfaces of physical
I/O devices. These virtual I/O devices include a virtual
display 70 and a virtual keyboard 72. As needed, other
virtual I/O devices such as a virtual mouse or virtual
serial and parallel ports could also be used.

[0074] In practice, control logic 50 monitors the
system bus for read or write operations directed to
addresses associated with non-BIOS I/O requests to
system I/O devices. When control logic 50 detects such
an operation, control logic 50 stores the information
necessary to reconstruct the operation in the
appropriate virtual device. Thus, for example, when control logic
50 detects a write operation directed to an address
associated with the display, control logic 50 stores the
information necessary to reconstruct the operation in
virtual display 70. Each time that a BIOS I/O request or
a quantum interrupt occurs, CE Driver 44 scans the
virtual I/O devices and, if the virtual devices are not empty,
assembles the information stored in the virtual devices
into an IPI packet and transmits the IPI packet to IOP
12. IOP 12 treats the packet like a BIOS I/O request
using procedure 100 discussed above. When control
logic 50 detects a read addressed to a virtual I/O device,
control logic 50 assembles the read request into an IPI
packet for handling by IOP 12. IOP 12 treats the IPI
packet like a standard BIOS I/O request.
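
The write path through the virtual I/O devices can be illustrated with a small sketch. The names `VirtualDevice`, `snoop_write`, and `scan_devices` are hypothetical, the address ranges are supplied by the caller rather than taken from any real hardware map, and the packet is represented as a plain dictionary.

```python
from typing import Dict, List, Optional, Tuple

Write = Tuple[int, int]  # (address, value)


class VirtualDevice:
    """Buffers writes aimed at the address range of a physical device."""

    def __init__(self, name: str, addresses: range) -> None:
        self.name = name
        self.addresses = addresses
        self.pending: List[Write] = []


def snoop_write(devices: List[VirtualDevice], address: int, value: int) -> bool:
    """Control-logic side: capture a bus write if it targets a virtual device."""
    for device in devices:
        if address in device.addresses:
            device.pending.append((address, value))
            return True
    return False


def scan_devices(devices: List[VirtualDevice]) -> Optional[Dict[str, List[Write]]]:
    """Driver side: drain non-empty devices into one packet, or return None."""
    packet = {d.name: list(d.pending) for d in devices if d.pending}
    for device in devices:
        device.pending.clear()
    return packet or None
```

Buffering the writes and forwarding them only at a BIOS I/O request or quantum interrupt means that direct device accesses reach the IOP at the same synchronization points as every other I/O request.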
[0075] Referring to Fig. 5, each CE 14 always
operates in one of eight states and, because there are only
a limited number of permissible state combinations,
system 10 always operates in one of fourteen states.
The major CE operating states are OFFLINE, RTB
(ready to boot), BOOTING, ACTIVE, RTS (ready to
sync), WAITING, M.SYNC (synchronizing as master),
and S.SYNC (synchronizing as slave). IOP Monitor 48
changes the operating states of CEs 14 based on the
state of system 10 and user commands from console
software 49. Through console software 49, a user can
reset a CE 14 at any time. Whenever the user resets a
CE 14, or a fault occurs in the CE 14, IOP Monitor 48
changes the state of the CE 14 to OFFLINE.

[0076] At startup, system 10 is operating with both
CEs 14 OFFLINE (state 150). System 10 operates in
the upper states of Fig. 5 (states 152-162) when CE 14a
becomes operational before CE 14b and in the lower
states (states 166-176) when CE 14b is the first to
become operational. If CEs 14 become operational
simultaneously, the first operational CE 14 to be
recognized by IOP Monitor 48 is treated as the first to become
operational.

[0077] When a CE 14 indicates that it is ready to boot
by issuing a boot request, the state of the CE 14
changes to RTB if the CE 14 is not set to autoboot or to
BOOTING if the CE 14 is set to autoboot. For example,
if CE 14a issues a boot request when both CEs 14 are
OFFLINE, and CE 14a is not set to autoboot, then the
state of CE 14a changes to RTB (state 152). Thereafter,
IOP Monitor 48 waits for the user, through console
software 49, to boot CE 14a. When the user boots CE 14a,
the state of CE 14a changes to BOOTING (state 154). If
the user resets CE 14a, the state of CE 14a changes to
OFFLINE (state 150).

[0078] If both CEs 14 are OFFLINE when CE 14a
issues a boot request, and CE 14a is set to autoboot,
the state of CE 14a changes to BOOTING (state 154). If
CE 14a boots successfully, the state of CE 14a changes
to ACTIVE (state 156).

[0079] When CE 14a is ACTIVE, and CE 14b issues
a boot request, or if CE 14b had issued a boot request
while the state of CE 14a was transitioning from
OFFLINE to ACTIVE (states 152-156), the state of CE
14b changes to RTS (state 158) if CE 14b is not set to
autosync and otherwise to WAITING (state 160). If the state
of CE 14b changes to RTS (state 158), IOP Monitor 48
waits for the user to issue a synchronize command to
CE 14b. When the user issues such a command, the
state of CE 14b changes to WAITING (state 160).
[0080] Once CE 14b is WAITING, IOP Monitor 48
copies the contents of memory system 34a of CE 14a into
memory system 34b of CE 14b. Once the memory
transfer is complete, IOP Monitor 48 waits for CE 14a to
transmit a quantum interrupt or I/O request IPI packet.
Upon receipt of such a packet, IOP Monitor 48 changes
the state of CE 14a to M.SYNC and the state of CE 14b
to S.SYNC (state 162), and synchronizes the CEs 14.
This synchronization includes responding to any
memory changes that occurred while IOP Monitor 48 was
waiting for CE 14a to transmit a quantum interrupt or I/O
request IPI packet. Upon completion of the
synchronization, the states of the CEs 14 both change to ACTIVE
(state 164) and system 10 is deemed to be fully
operational.

[0081] In an alternative implementation, IOP Monitor
48 does not wait for memory transfer to complete before
changing the state of CE 14a to M.SYNC and the state
of CE 14b to S.SYNC (state 162). Instead, IOP Monitor
48 makes this state change upon receipt of an IPI
packet from CE 14a and performs the memory transfer
as part of the synchronization process.
[0082] Similar state transitions occur when CE 14b is
the first CE 14 to issue a boot request. Thus, assuming
that CE 14b is not set to autoboot, CE 14b transitions
from OFFLINE (state 150) to RTB (state 166) to
BOOTING (state 168) to ACTIVE (state 170). Similarly, once
CE 14b is ACTIVE, and assuming that CE 14a is not set
to autosync, CE 14a transitions from OFFLINE (state
170) to RTS (state 172) to WAITING (state 174) to
S.SYNC (state 176) to ACTIVE (state 164).
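
The per-CE transitions traced in the preceding paragraphs can be collected into a small state machine. This is a sketch, not the monitor's implementation: the event names (`user_boot`, `boot_ok`, `sync_command`, `master_packet`, `sync_done`) are hypothetical labels, a CE that is not set to autosync passes through RTS first, and the sketch ignores the detail that a second CE's request is held until the first CE is ACTIVE.

```python
from enum import Enum


class CEState(Enum):
    OFFLINE = "OFFLINE"
    RTB = "RTB"
    BOOTING = "BOOTING"
    ACTIVE = "ACTIVE"
    RTS = "RTS"
    WAITING = "WAITING"
    M_SYNC = "M.SYNC"
    S_SYNC = "S.SYNC"


def on_boot_request(other: CEState, autoboot: bool, autosync: bool) -> CEState:
    """Next state of an OFFLINE CE that issues a boot request."""
    if other is CEState.OFFLINE:
        # First CE to come up: boot now, or wait for the user.
        return CEState.BOOTING if autoboot else CEState.RTB
    # The other CE is already running or starting: join by synchronizing.
    return CEState.WAITING if autosync else CEState.RTS


# Transitions of the second CE (the slave) after its boot request.
SLAVE_PATH = {
    (CEState.RTS, "sync_command"): CEState.WAITING,
    (CEState.WAITING, "master_packet"): CEState.S_SYNC,
    (CEState.S_SYNC, "sync_done"): CEState.ACTIVE,
}

# Transitions of the first CE (later the master).
MASTER_PATH = {
    (CEState.RTB, "user_boot"): CEState.BOOTING,
    (CEState.BOOTING, "boot_ok"): CEState.ACTIVE,
    (CEState.ACTIVE, "master_packet"): CEState.M_SYNC,
    (CEState.M_SYNC, "sync_done"): CEState.ACTIVE,
}


def step(state: CEState, event: str, table: dict) -> CEState:
    """Apply one event; a reset or fault always leads to OFFLINE."""
    if event in ("reset", "fault"):
        return CEState.OFFLINE
    return table.get((state, event), state)
```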
[0083] In other embodiments of the invention, for
example, referring to Fig. 6, a fault resilient system 200
includes two IOPs 202 and two CEs 204. Each CE 204
is connected, through an IPI card 206 and a cable 208,
to an IPI card 210 of each IOP 202. IOPs 202 are
redundantly connected to each other through IPI cards 210
and cables 212. Because every component of system
200 has a redundant backup component, system 200 is
entirely fault resilient. In an alternative approach, cables
208 and 212 could be replaced by a pair of local area
networks to which each IOP 202 and CE 204 would be
connected. Indeed, local area networks can always be
substituted for cable connections.
[0084] System 200 is operating system and
application software independent in that it does not require
modifications of the operating system or the application
software to operate. Any single piece of hardware can
be upgraded or repaired in system 200 with no service
interruption. Therefore, by sequentially replacing each
piece of hardware and allowing system 200 to
resynchronize after each replacement, the hardware of
system 200 can be replaced in its entirety without service
interruption. Similarly, software on system 200 can be
upgraded with minimal service interruption (that is,
during the software upgrade, the application will become
unavailable for an acceptable period of time such as two
seconds). Also, disaster tolerance for purposes of
availability can be obtained by placing each IOP/CE pair in a
separate location and connecting the pairs through a
communications link.

[0085] Referring to Fig. 7, a distributed, high
performance, fault resilient system 220 includes two systems
200, the IOPs 202 of which are connected to each other,
through IPI modules, by cables 222. System 220 uses
distributed computing environment software to achieve
high performance by running separate portions of an
application on each system 200. System 220 is fault
tolerant and offers the ability to perform both hardware and
software upgrades without service interruption.
[0086] Referring to Fig. 8, a fault tolerant system 230
includes three IOPs (232, 234, and 236) and three CEs
(238, 240, and 242). Through IPI modules 244 and
cables 246, each IOP is connected to an IPI module 244
of each of the other IOPs. Through IPI modules 248 and
cables 250, each CE is connected to an IPI module 244
of two of the IOPs, with CE 238 being connected to
IOPs 232 and 234, CE 240 being connected to IOPs
232 and 236, and CE 242 being connected to IOPs 234
and 236. Like system 200, system 230 allows for
hardware upgrades without service interruption and
software upgrades with only minimal service interruption.
[0087] As can be seen from a comparison of Figs. 7
and 8, the CEs and IOPs of systems 200 and 230 are
identically configured. As a result, upgrading a fault
resilient system 200 to a fault tolerant system 230 does
not require any replacement of existing hardware and
entails the simple procedure of adding an additional
CE/IOP pair, connecting the cables, and making
appropriate changes to the system software. This modularity
is an important feature of the paired modular redundant
architecture of the invention.

[0088] Because the components of system 230 are
triply redundant, system 230 is more capable of
identifying the source of a hardware fault than is system 10.
Thus, while system 10 simply disables one or both of
CEs 14 when an error is detected, system 230 offers a
higher degree of fault diagnosis.
[0089] Referring to Fig. 9, each IOP (232, 234, 236) of
system 230 performs fault diagnosis according to a
procedure 300. Initially, each IOP (232, 234, 236) checks
for major faults such as power loss, broken cables, and
nonfunctional CEs or IOPs using well known techniques
such as power sensing, cable sensing, and protocol
timeouts (step 302). When such a fault is detected,
each IOP disables the faulty device or, if necessary, the
entire system.

[0090] After checking for major faults, each IOP waits
to receive IPI packets (that is, quantum interrupts or I/O
requests) from the two CEs to which the IOP is
connected (step 304). Thus, for example, IOP 232 waits to
receive IPI packets from CEs 238 and 240. After
receiving IPI packets from both connected CEs, each IOP
transmits the checksums ("CRCs") of those IPI packets
to the other two IOPs and waits for receipt of CRCs from
the other two IOPs (step 306).

[0091] After receiving the CRCs from the other two
IOPs, each IOP generates a three by three matrix in
which each column corresponds to a CE, each row
corresponds to an IOP, and each entry is the CRC received
from the column's CE by the row's IOP (step 308). Thus,
for example, IOP 232 generates the following matrix:

            CE 238    CE 240    CE 242
IOP 232     CRC       CRC       X
IOP 234     CRC       X         CRC
IOP 236     X         CRC       CRC

After generating the matrix, IOP 232 sums the entries in
each row and each column of the matrix. If the three row
sums are equal and the three column sums are equal
(step 310), then there is no fault and IOP 232 checks
again for major faults (step 302).
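
Steps 308 and 310 can be sketched in a few lines. This is illustrative code only: the helper names are hypothetical, CRCs are modelled as integers, and an "X" entry (no connection) is represented by `None` and assumed to contribute nothing to a sum.

```python
from typing import Dict, List, Optional, Tuple

IOPS = ["IOP 232", "IOP 234", "IOP 236"]
CES = ["CE 238", "CE 240", "CE 242"]

Matrix = List[List[Optional[int]]]


def build_matrix(received: Dict[Tuple[str, str], int]) -> Matrix:
    """Rows are IOPs, columns are CEs; None marks an 'X' (no connection)."""
    return [[received.get((iop, ce)) for ce in CES] for iop in IOPS]


def row_sums(matrix: Matrix) -> List[int]:
    return [sum(v for v in row if v is not None) for row in matrix]


def column_sums(matrix: Matrix) -> List[int]:
    return [sum(v for v in col if v is not None) for col in zip(*matrix)]


def no_fault(matrix: Matrix) -> bool:
    """Step 310: all row sums agree and all column sums agree."""
    return (len(set(row_sums(matrix))) == 1
            and len(set(column_sums(matrix))) == 1)
```

Each row and each column holds exactly two CRCs, so when every CE produced the same packet and every path delivered it intact, all six sums equal twice the common CRC.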
[0092] If either the three rows' sums or the three
columns' sums are unequal (step 310), then IOP 232
compares the CRC entries in each of the columns of the
matrix. If the two CRC entries in each column match
(step 312), then IOP 232 diagnoses that a CE failure
has occurred and disables the CE corresponding to the
column for which the sum does not equal the sums of
the other columns (step 314).

[0093] If the CRC entries in one or more of the matrix
columns do not match (step 312), then IOP 232
determines how many of the columns include mismatched
entries. If the matrix includes only one column with
mismatched entries (step 315), then IOP 232 diagnoses
that the path between the IOP corresponding to the
matrix row sum that is unequal to the other matrix row
sums and the CE corresponding to the column having
mismatched entries has failed and disables that path
(step 316). For purposes of the diagnosis, the path
includes the IPI module 244 in the IOP, the IPI module
248 in the CE, and the cable 250.
[0094] If the matrix includes more than one column
with mismatched entries (step 315), then IOP 232
confirms that one matrix row sum is unequal to the other
matrix row sums, diagnoses an IOP failure, and
disables the IOP corresponding to the matrix row sum that is
unequal to the other matrix row sums (step 318).
[0095] If, after diagnosing and accounting for a CE
failure (step 314), path failure (step 316), or IOP failure
(step 318), IOP 232 determines that system 230 still
includes sufficient non-faulty hardware to remain
operational, IOP 232 checks again for major faults (step 302).
Because system 230 is triply redundant, system 230
can continue to operate even after several components
have failed. For example, to remain operating in an
availability mode, system 230 only needs to have a
single functional CE, a single functional IOP, and a
functional path between the two.
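
The decision sequence of steps 310 through 318 can be sketched as a single function. It is a simplified illustration: the names `diagnose` and `odd_one_out` are hypothetical, the matrix uses `None` for unconnected entries, results are returned as row and column indices, and the follow-up check on remaining hardware is not modelled.

```python
from typing import List, Optional, Tuple

Matrix = List[List[Optional[int]]]


def odd_one_out(values: List[int]) -> Optional[int]:
    """Index of the single value that differs from all the others, if any."""
    for i, v in enumerate(values):
        others = values[:i] + values[i + 1:]
        if len(set(others)) == 1 and v != others[0]:
            return i
    return None


def diagnose(matrix: Matrix) -> Tuple[str, Optional[int], Optional[int]]:
    """Return (fault kind, IOP row index, CE column index)."""
    rows = [sum(v for v in r if v is not None) for r in matrix]
    columns = [[v for v in c if v is not None] for c in zip(*matrix)]
    col_sums = [sum(c) for c in columns]
    if len(set(rows)) == 1 and len(set(col_sums)) == 1:
        return ("none", None, None)                         # step 310
    mismatched = [j for j, c in enumerate(columns) if len(set(c)) > 1]
    if not mismatched:                                       # step 312
        return ("ce", None, odd_one_out(col_sums))          # step 314
    if len(mismatched) == 1:                                 # step 315
        return ("path", odd_one_out(rows), mismatched[0])   # step 316
    return ("iop", odd_one_out(rows), None)                 # step 318
```

A faulty CE corrupts a whole column consistently, a faulty path corrupts one entry, and a faulty IOP corrupts a whole row, which is why counting mismatched columns is enough to tell the three cases apart.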

[0096] Using procedure 300, each IOP (232, 234,
236) can correctly diagnose any single failure in a fully
operational system 230 or in a system 230 in which one
element (that is, a CE, an IOP, or a path) has previously
been disabled. In a system 230 in which an element has
been disabled, each IOP accounts for CRCs that are not
received because of the disabled element by using
values that appear to be correct in comparison to actually
received CRCs.

[0097] Procedure 300 is not dependent on the
particular arrangement of interconnections between the CEs
and IOPs. To operate properly, procedure 300 only
requires that the output of each CE be directly
monitored by at least two IOPs. Thus, procedure 300 could
be implemented in a system using any interconnect
mechanism and does not require point to point
connections between the CEs and IOPs. For example, the CEs
and IOPs could be connected to at least two local area
networks. In an alternative approach, instead of summing
the CRC values in the rows and columns of the matrix,
these values can be compared and those rows or
columns in which the entries do not match can be marked
with a match/mismatch indicator.
[0098] A simplified version of procedure 300 can be
implemented for use in a system 200. In this procedure,
each IOP 202 of system 200 generates a two by two
matrix in which each column corresponds to a CE 204
and each row corresponds to an IOP 202:

            CE 204    CE 204
IOP 202     CRC       CRC
IOP 202     CRC       CRC

After generating the matrix, each IOP 202 attaches a
mismatch indicator to each row or column in which the
two entries are mismatched.

[0099] If there are no mismatch indicators, then
system 200 is operating correctly.
[0100] If neither row and both columns have mismatch
indicators, then an IOP 202 has faulted. Depending on
the operating mode of system 200, an IOP 202 either
disables another IOP 202 or shuts down system 200.
The IOP 202 to be disabled is selected based on user
supplied parameters similar to the two availability
modes used in system 10.

[0101] If both rows and neither column have mismatch
indicators, then a CE 204 has faulted. In this case, IOPs
202 respond by disabling a CE 204 if system 200 is
operating in an availability mode or, if system 200 is
operating in an integrity mode, shutting down system
200. If both rows and one column have mismatch
indicators, then one of the paths between the IOPs 202 and
the CE 204 corresponding to the mismatched column
has failed. Depending on the operating mode of system
200, IOPs 202 either disable the CE 204 having the
failed path or shut down system 200. If both rows and
both columns have mismatch indicators, then multiple
faults exist and IOPs 202 shut down system 200.
[0102] If one row and both columns have mismatch
indicators, then the IOP 202 corresponding to the
mismatched row has faulted. Depending on the operating
mode of system 200, the other IOP 202 either disables
the faulty IOP 202 or shuts down system 200. If one row
and one column have mismatch indicators, then the
path between the IOP 202 corresponding to the
mismatched row and the CE 204 corresponding to the
mismatched column has failed. Depending on the operating
mode of system 200, IOPs 202 either account for the
failed path in future processing or shut down system
200.
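
The mismatch-indicator cases for the two by two matrix reduce to a lookup on how many rows and how many columns are mismatched. The function name `diagnose_2x2` and the result strings are hypothetical; the sketch only classifies the fault and leaves the mode-dependent response (disable or shut down) to the caller.

```python
from typing import List


def diagnose_2x2(matrix: List[List[int]]) -> str:
    """Classify a fault from a 2x2 CRC matrix (rows are IOPs, columns are CEs)."""
    bad_rows = sum(1 for row in matrix if row[0] != row[1])
    bad_cols = sum(1 for col in zip(*matrix) if col[0] != col[1])
    table = {
        (0, 0): "no fault",
        (0, 2): "an IOP has faulted",
        (2, 0): "a CE has faulted",
        (2, 1): "a path to the mismatched CE has failed",
        (2, 2): "multiple faults",
        (1, 2): "the IOP of the mismatched row has faulted",
        (1, 1): "the path between the mismatched IOP and CE has failed",
    }
    return table.get((bad_rows, bad_cols), "combination not covered by the text")
```

For example, `diagnose_2x2([[1, 1], [1, 2]])` reports a failed path, because only the second IOP's row and the second CE's column contain differing entries.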

[0103] Referring to Fig. 10, one embodiment of a
disaster tolerant system 260 includes two fault tolerant
systems 230 located in remote locations and connected by
communications link 262, such as Ethernet or fiber, and
operating in meta time lockstep with each other. To
obtain meta time lockstep, all IPI packets are
transmitted between fault tolerant systems 230. Like system
220, system 260 allows for hardware and software
upgrades without service interruption.
[0104] As shown, the paired modular redundant
architecture of the invention allows for varying levels of fault
resilience and fault tolerance through use of CEs that
operate asynchronously in real time and are controlled
by IOPs to operate synchronously in meta time. This
architecture is simple and cost-effective, and can be
expanded or upgraded with minimal difficulty.



Claims 

1. A method of producing a fault resilient or fault
tolerant computer, comprising the steps of:

designating a first processor as a computing
element;

designating a second processor as a controller;

connecting the computing element and the
controller to produce a modular pair;
connecting at least two modular pairs to
produce a fault resilient or fault tolerant computer,
wherein each computing element performs all
instructions in the same number of cycles as
the other computing elements.

2. The method of claim 1, wherein the first and second
processors are industry standard processors.

3. The method of claim 1, further including the step of
running industry standard operating systems and 
applications on the at least two controllers and the 
at least two computing elements. 

4. The method of claim 1, further including the steps
of: 

running a first operating system on the at least 
two controllers; and 

running a second operating system on the at 
least two computing elements. 

5. The method of claim 1, further comprising the step
of locating a modular pair remotely from the one or 
more other modular pairs to provide disaster toler- 
ance. 

6. The method of claim 1, further comprising the steps
of:

connecting a first I/O device to a first modular
pair;

connecting a second I/O device to a second
modular pair, said second I/O device being
redundant of the first I/O device; and
transmitting at least identical I/O write requests
and data to the first and second I/O devices.

7. The method of claim 6, further comprising the steps
of:

connecting a third I/O device to a third modular
pair, said third I/O device being redundant of
the first and second I/O devices; and
transmitting at least identical I/O write requests
and data to the first, second, and third I/O
devices.

8. The method of claim 1, further comprising the step
of activating an inactive processor by transferring 
the operational state of an active processor to the 
inactive processor through a controller. 


9. The method of claim 8, further comprising the step 
of pausing processing by said computing elements 
during said transferring step. 

10. The method of claim 8, further comprising the step
of performing said transferring step as a background
process without pausing processing by said

11. The method of claim 1, further comprising the step
of upgrading a processor while said computing
elements are processing by:

disabling a processor to be upgraded;
upgrading the disabled processor; and
reactivating the upgraded processor by
transferring the operational state of an active
processor to the upgraded processor through a


12. The method of claim 1, further comprising the step
of repairing a processor while said computing
elements are processing by:

disabling a processor to be repaired;
repairing the disabled processor; and
reactivating the repaired processor by
transferring the operational state of an active
processor to the repaired processor through a
controller.

13. A method of detecting and diagnosing faults in a
computer system that includes at least two
computing elements and at least two controllers, wherein
each of the computing elements is connected to at
least two of the controllers, and each controller is
connected to at least two computing elements and
to the other controllers, said method comprising the
steps of:

producing data at each of the computing
elements;

generating a value at each of the computing
elements that relates to the produced data;
transmitting the data, along with the
corresponding values, from each computing
element to the at least two connected controllers;
transmitting the values received by each
controller to the other controllers; and
performing computations on the values
corresponding to each computing element and the
values corresponding to each controller;
wherein, when the results of the computations
performed on the values corresponding to each
controller are equal, and the results of the
computations performed on the values
corresponding to each computing element are equal, no
faults exist.

14. The method of claim 13, further comprising, when
the results of the computations performed on the
values corresponding to each computing element
and the results of the computations performed on
the values corresponding to each controller are not
equal, the steps of:

comparing, for each one of the computing
elements, all of the values corresponding to the
one computing element, and
designating one of the computing elements as
faulty when the values corresponding to each
computing element match for each computing
element, but mismatch for different computing
elements.

15. The method of claim 13, further comprising, when
the results of the computations performed on the
values corresponding to each computing element
and the results of the computations performed on
the values corresponding to each controller are not
equal, the steps of:

comparing, for each one of the computing
elements, all of the values corresponding to the
one computing element, and
designating a connection to one of the
computing elements as faulty when the values
corresponding only to the one computing element
mismatch.

16. The method of claim 13, further comprising, when
the results of the computations performed on the
values corresponding to each computing element
and the results of the computations performed on
the values corresponding to each controller are not
equal, the steps of:

comparing, for each one of the computing
elements, all of the values corresponding to the
one computing element, and
when the values corresponding to two or more
of the computing elements mismatch,
designating the controller connected to the two or
more computing elements as faulty.